Optimizing Synonym Extraction Using Monolingual and Bilingual Resources
نویسندگان
چکیده
Automatically acquiring synonymous words (synonyms) from corpora is a challenging task. For this task, methods that use only one kind of resources are inadequate because of low precision or low recall. To improve the performance of synonym extraction, we propose a method to extract synonyms with multiple resources including a monolingual dictionary, a bilingual corpus, and a large monolingual corpus. This approach uses an ensemble to combine the synonyms extracted by individual extractors which use the three resources. Experimental results prove that the three resources are complementary to each other on synonym extraction, and that the ensemble method we used is very effective to improve both precisions and recalls of extracted synonyms.
منابع مشابه
Word Alignment with Synonym Regularization
We present a novel framework for word alignment that incorporates synonym knowledge collected from monolingual linguistic resources in a bilingual probabilistic model. Synonym information is helpful for word alignment because we can expect a synonym to correspond to the same word in a different language. We design a generative model for word alignment that uses synonym information as a regulari...
متن کاملMutual Bilingual Terminology Extraction
This paper describes a novel methodology to perform bilingual terminology extraction, in which automatic alignment is used to improve the performance of terminology extraction for each language. The strengths of monolingual terminology extraction for each language are exploited to improve the performance of terminology extraction in the other language, thanks to the availability of a sentence-l...
متن کاملThe CQC Algorithm: Cycling in Graphs to Semantically Enrich and Enhance a Bilingual Dictionary
Bilingual machine-readable dictionaries are knowledge resources useful in many automatic tasks. However, compared to monolingual computational lexicons like WordNet, bilingual dictionaries typically provide a lower amount of structured information such as lexical and semantic relations, and often do not cover the entire range of possible translations for a word of interest. In this paper we pre...
متن کاملBilingual Synonym Identification with Spelling Variations
This paper proposes a method for identifying synonymous relations in a bilingual lexicon, which is a set of translation-equivalent term pairs. We train a classifier for identifying those synonymous relations by using spelling variations as main clues. We compared two approaches: the direct identification of bilingual synonym pairs, and the merger of two monolingual synonyms. We showed that our ...
متن کاملSynonymous Collocation Extraction Using Translation Information
Automatically acquiring synonymous collocation pairs such as and from corpora is a challenging task. For this task, we can, in general, have a large monolingual corpus and/or a very limited bilingual corpus. Methods that use monolingual corpora alone or use bilingual corpora alone are apparently inadequate because of low precision or low coverage. I...
متن کامل